The Stochastic Calculus of Looping Sequences is suitable to describe the evolution of microbiological
systems, taking into account the speed of the described activities. We propose a type system
for this calculus that models how the presence of positive and negative catalysers can modify these
speeds. We claim that types are the right abstraction in order to represent the interaction between
elements without specifying exactly the element positions. Our claim is supported through an example
modelling the lactose operon.

1 Introduction 

The Calculus of Looping Sequences (CLS for short) |@l|5l[T9l, is a formalism for describing biological 
systems and their evolution. CLS is based on term rewriting, given a set of predefined rules modelling 
the activities one would like to describe. The model has been extended with several features, such as 
a commutative parallel composition operator, and some semantic means, such as bisimulations SIT], 
which are common in process calculi. This makes it possible to combine the simplicity of notation of
rewrite systems with the advantage of a form of compositionality. A stochastic version of CLS (SCLS for
short) is proposed in [6]. Rates are associated with rewrite rules in order to model the speed of the
described activities. Therefore, transitions derived in SCLS are driven by a rate that models the parameter 
of an exponential distribution and characterizes the stochastic behaviour of the transition. The choice 
of the next rule to be applied and of the time of its application is based on the classical Gillespie's 
algorithm [15]. 

Defining a stochastic semantics for CLS requires a correct enumeration of all the possible and distinct 
ways to apply each rewrite rule within a term. A single pattern may have several, though isomorphic, 
matches within a CLS term. In this paper, we simplify the counting mechanism used in [6] by imposing
some restrictions on the patterns modelling the rewrite rules. Each rewrite rule states explicitly the types
of the elements whose occurrences are able to speed up or slow down a reaction. The occurrences of the
elements of these types are then processed by a rate function (instead of a rate constant) which is used 
to compute the actual rate of a transition. We show how we can define patterns in our typed stochastic 
framework to model some common biological activities, and, in particular, we underline the possibility to 
combine the modelling of positive and negative catalysers within a single rule by reproducing a general 
case of osmosis. 

Finally, as a complete modelling application, we show the expressiveness of our formalism by
describing the lactose operon in Escherichia coli.
Summary. The remainder of this paper is organized as follows. In Section 2 we formally recall the
Calculus of Looping Sequences. In Section 3 we introduce our typed stochastic extension and we give
some guidelines for the modelling of biological systems. In Section 4 we use our framework to model
the lactose operon of Escherichia coli. Finally, in Section 5 we draw our conclusions and we present
some related work.



2 The Calculus of Looping Sequences 

In this section we recall the Calculus of Looping Sequences (CLS). CLS is essentially based on term 
rewriting, hence a CLS model consists of a term and a set of rewrite rules. The term is intended to 
represent the structure of the modelled system, and the rewrite rules to represent the events that may 
cause the system to evolve. 

We start by defining the syntax of terms. We assume a possibly infinite alphabet $\mathcal{E}$ of symbols
ranged over by $a, b, c, \ldots$.

Definition 2.1 (Terms) Terms $T$ and sequences $S$ of CLS are given by the following grammar:

$$
\begin{array}{rcl}
T & ::= & S \;\big|\; (S)^L \rfloor T \;\big|\; T \mid T \\
S & ::= & \epsilon \;\big|\; a \;\big|\; S \cdot S
\end{array}
$$

where $a$ is a generic element of $\mathcal{E}$, and $\epsilon$ represents the empty sequence. We denote with
$\mathcal{T}$ the infinite set of terms, and with $\mathcal{S}$ the infinite set of sequences.

In CLS we have a sequencing operator $\_ \cdot \_$, a looping operator $(\_)^L$, a parallel composition
operator $\_ \mid \_$ and a containment operator $\_ \rfloor \_$. Sequencing can be used to concatenate
elements of the alphabet $\mathcal{E}$. The empty sequence $\epsilon$ denotes the concatenation of zero symbols.
A term can be either a sequence or a looping sequence (that is the application of the looping operator to
a sequence) containing another term, or the parallel composition of two terms. By definition, looping and
containment are always applied together, hence we can consider them as a single binary operator
$(\_)^L \rfloor \_$ which applies to one sequence and one term.
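
As an illustration of this syntax, the following is a minimal Python sketch of CLS terms as a data structure. The class names `Loop` and `Par`, the helper `show` and the ASCII rendering (`.` for sequencing, `]` for containment, `eps` for the empty sequence) are assumptions of this sketch and not part of the calculus.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# A sequence is a tuple of alphabet symbols; the empty tuple stands for epsilon.
Seq = Tuple[str, ...]


@dataclass(frozen=True)
class Loop:
    """(S)^L ] T: a looping sequence S containing a term T."""
    membrane: Seq
    content: "Term"


@dataclass(frozen=True)
class Par:
    """T1 | T2: parallel composition of two terms."""
    left: "Term"
    right: "Term"


Term = Union[Seq, Loop, Par]


def show(t: Term) -> str:
    """Render a term in an ASCII version of the CLS notation."""
    if isinstance(t, tuple):
        return ".".join(t) if t else "eps"
    if isinstance(t, Loop):
        return f"({show(t.membrane)})^L ] ({show(t.content)})"
    return f"{show(t.left)} | {show(t.right)}"


# The term (a.b.c)^L ] ((d.e)^L | f.g), with (d.e)^L written out as (d.e)^L ] eps.
example = Loop(("a", "b", "c"), Par(Loop(("d", "e"), ()), ("f", "g")))
```

Looping and containment are merged into the single constructor `Loop`, mirroring the remark that they can be seen as one binary operator applied to a sequence and a term.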

The biological interpretation of the operators is the following: the main entities which occur in
cells are DNA and RNA strands, proteins, membranes, and other macro-molecules. DNA strands (and
similarly RNA strands) are sequences of nucleic acids, but they can be seen also at a higher level of
abstraction as sequences of genes. Proteins are sequences of amino acids which usually have a very complex
three-dimensional structure. In a protein there are usually (relatively) few subsequences, called domains,
which actually are able to interact with other entities by means of chemical reactions. CLS sequences
can model DNA/RNA strands and proteins by describing each gene or each domain with a symbol of
the alphabet. Membranes are closed surfaces, often interspersed with proteins, which may contain
something. A closed surface can be modelled by a looping sequence. The elements (or the subsequences) of
the looping sequence may represent the proteins on the membrane, and by the containment operator it
is possible to specify the content of the membrane. Other macro-molecules can be modelled as single
alphabet symbols, or as short sequences. Finally, juxtaposition of entities can be described by the parallel
composition of their representations.

Brackets can be used to indicate the order of application of the operators, and we assume
$(\_)^L \rfloor \_$ to have precedence over $\_ \mid \_$. In Figure 1 we show some examples of CLS terms
and their visual representation, using $(S)^L$ as a short-cut for $(S)^L \rfloor \epsilon$.

In CLS we may have syntactically different terms representing the same structure. We introduce a 
structural congruence relation to identify such terms. 
Figure 1: (i) represents $(a \cdot b \cdot c)^L$; (ii) represents $(a \cdot b \cdot c)^L \rfloor (d \cdot e)^L$;
(iii) represents $(a \cdot b \cdot c)^L \rfloor ((d \cdot e)^L \mid f \cdot g)$.

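Definition 2.2 (Structural Congruence) The structural congruence relations $\equiv_S$ and $\equiv_T$ are the
least congruence relations on sequences and on terms, respectively, satisfying the following rules:

$$
\begin{array}{c}
S_1 \cdot (S_2 \cdot S_3) \equiv_S (S_1 \cdot S_2) \cdot S_3 \qquad S \cdot \epsilon \equiv_S \epsilon \cdot S \equiv_S S \\
S_1 \equiv_S S_2 \ \text{implies}\ S_1 \equiv_T S_2 \ \text{and}\ (S_1)^L \rfloor T \equiv_T (S_2)^L \rfloor T \\
T_1 \mid T_2 \equiv_T T_2 \mid T_1 \qquad T_1 \mid (T_2 \mid T_3) \equiv_T (T_1 \mid T_2) \mid T_3 \qquad T \mid \epsilon \equiv_T T \\
(\epsilon)^L \rfloor \epsilon \equiv_T \epsilon \qquad (S_1 \cdot S_2)^L \rfloor T \equiv_T (S_2 \cdot S_1)^L \rfloor T
\end{array}
$$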

Rules of the structural congruence state the associativity of $\cdot$ and $\mid$, the commutativity of the
latter and the neutral role of $\epsilon$. Moreover, axiom
$(S_1 \cdot S_2)^L \rfloor T \equiv_T (S_2 \cdot S_1)^L \rfloor T$ says that looping sequences can
rotate. In the following, for simplicity, we will use $\equiv$ in place of $\equiv_T$.
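
One way to decide structural congruence in an implementation is to map each term to a canonical representative. The sketch below is only an illustration under assumptions of our own: a term is a list of parallel components, each component is either `("seq", symbols)` or `("loop", membrane, content)`, and the names `min_rotation` and `canonical` are hypothetical.

```python
def min_rotation(seq):
    """Pick the smallest rotation of a tuple, so that (S1.S2)^L and (S2.S1)^L coincide."""
    if not seq:
        return seq
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def canonical(term):
    """Return a canonical form of a term given as a list of parallel components.

    Sequences are flat tuples, which accounts for associativity of sequencing
    and for the neutral role of epsilon inside sequences.
    """
    out = []
    for comp in term:
        if comp[0] == "seq":
            if comp[1]:                       # T | eps = T: drop empty sequences
                out.append(("seq", tuple(comp[1])))
        else:                                 # ("loop", membrane, content)
            membrane = min_rotation(tuple(comp[1]))
            content = canonical(comp[2])
            if membrane or content:           # (eps)^L ] eps = eps
                out.append(("loop", membrane, content))
    return tuple(sorted(out))                 # | is commutative and associative


t1 = [("loop", ("a", "b", "c"), [("seq", ("d",)), ("seq", ("e",))]), ("seq", ("f",))]
t2 = [("seq", ("f",)), ("seq", ()),
      ("loop", ("c", "a", "b"), [("seq", ("e",)), ("seq", ("d",))])]
```

Two terms are then congruent exactly when their canonical forms are equal, as for `t1` and `t2` above, which differ only by a rotation of the membrane, the order of parallel components and an extra empty sequence.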

Rewrite rules will be defined essentially as pairs of terms, with the first term describing the portion 
of the system in which the event modelled by the rule may occur, and the second term describing how 
that portion of the system changes when the event occurs. In the terms of a rewrite rule we allow the 
use of variables. As a consequence, a rule will be applicable to all terms which can be obtained by 
properly instantiating its variables. Variables can be of three kinds: two of these are associated with 
the two different syntactic categories of terms and sequences, and one is associated with single alphabet 
elements. We assume a set of term variables $TV$ ranged over by $X, Y, Z, \ldots$, a set of sequence variables
$SV$ ranged over by $\tilde{x}, \tilde{y}, \tilde{z}, \ldots$, and a set of element variables $\mathcal{X}$ ranged
over by $x, y, z, \ldots$. All these sets are possibly infinite and pairwise disjoint. We denote by
$\mathcal{V}$ the set of all variables, $\mathcal{V} = TV \cup SV \cup \mathcal{X}$,
and with $\chi$ a generic variable of $\mathcal{V}$. Hence, a pattern is a term that may include variables.

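Definition 2.3 (Patterns) Patterns $P$ and sequence patterns $SP$ of CLS are given by the following
grammar:

$$
\begin{array}{rcl}
P & ::= & SP \;\big|\; (SP)^L \rfloor P \;\big|\; P \mid P \;\big|\; X \\
SP & ::= & \epsilon \;\big|\; a \;\big|\; SP \cdot SP \;\big|\; \tilde{x} \;\big|\; x
\end{array}
$$

where $a$ is a generic element of $\mathcal{E}$, and $X$, $\tilde{x}$ and $x$ are generic elements of $TV$, $SV$
and $\mathcal{X}$, respectively. We denote with $\mathcal{P}$ the infinite set of patterns.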

We assume the structural congruence relation to be trivially extended to patterns. An instantiation is
a partial function $\sigma : \mathcal{V} \to \mathcal{T}$. An instantiation must preserve the type of variables,
thus for $X \in TV$, $\tilde{x} \in SV$ and $x \in \mathcal{X}$ we have $\sigma(X) \in \mathcal{T}$,
$\sigma(\tilde{x}) \in \mathcal{S}$ and $\sigma(x) \in \mathcal{E}$, respectively. Given $P \in \mathcal{P}$, with
$P\sigma$ we denote the term obtained by replacing each occurrence of each variable $\chi \in \mathcal{V}$
appearing in $P$ with the corresponding term $\sigma(\chi)$. With $\Sigma$ we denote the set of all the possible
instantiations and, given $P \in \mathcal{P}$, with $Var(P)$ we denote the set of variables appearing in $P$.
Now we define rewrite rules.

Definition 2.4 (Rewrite Rules) A rewrite rule is a pair of patterns $(P_1, P_2)$, denoted with
$P_1 \mapsto P_2$, where $P_1, P_2 \in \mathcal{P}$, $P_1 \not\equiv \epsilon$ and such that
$Var(P_2) \subseteq Var(P_1)$.

A rewrite rule $P_1 \mapsto P_2$ states that a term $P_1\sigma$, obtained by instantiating variables in $P_1$
by some instantiation function $\sigma$, can be transformed into the term $P_2\sigma$. We define the semantics
of CLS as a transition system, in which states correspond to terms, and transitions correspond to rule
applications.

We define the semantics of CLS by resorting to the notion of contexts. 

Definition 2.5 (Contexts) Contexts $C$ are defined as:

$$
C ::= \square \;\big|\; C \mid T \;\big|\; T \mid C \;\big|\; (S)^L \rfloor C
$$

where $T \in \mathcal{T}$ and $S \in \mathcal{S}$. The context $\square$ is called the empty context. We denote
with $\mathcal{C}$ the infinite set of contexts.

By definition, every context contains a single hole $\square$. Let us assume $C \in \mathcal{C}$. With $C[T]$
we denote the term obtained by replacing $\square$ with $T$ in $C$. The structural equivalence is extended to
contexts in the natural way (i.e. by considering $\square$ as a new and unique symbol of the alphabet
$\mathcal{E}$).

Rewrite rules can be applied to terms only if they occur in a legal context. Note that the general
form of rewrite rules does not allow sequences as contexts. A rewrite rule introducing a parallel
composition on the right hand side (such as $a \mapsto b \mid c$) applied to an element of a sequence (e.g.,
$m \cdot a \cdot m$) would result in a syntactically incorrect term (in this case $m \cdot (b \mid c) \cdot m$).
To modify a sequence, a pattern representing the whole sequence must appear in the rule. For example, rule
$a \cdot \tilde{x} \mapsto a \mid \tilde{x}$ can be applied to any sequence starting with element $a$, and,
hence, the term $a \cdot b$ can be rewritten as $a \mid b$, and the term $a \cdot b \cdot c$ can be rewritten
as $a \mid b \cdot c$.
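
The effect of a rule whose pattern covers a whole sequence can be sketched in a few lines of Python. The function name `apply_split_rule` and the encoding of a sequence as a tuple of symbols are assumptions made for this illustration only.

```python
def apply_split_rule(seq, head="a"):
    """Apply the rule head.x~ -> head | x~ to a whole sequence.

    Returns the list of parallel components of the result, or None when the
    sequence does not start with `head` and the rule is therefore not applicable.
    """
    if seq and seq[0] == head:
        rest = seq[1:]
        # x~ may be instantiated with epsilon; in that case only `head` remains.
        return [(head,)] + ([rest] if rest else [])
    return None
```

For instance, the sequence `("a", "b", "c")` is rewritten into the two components `("a",)` and `("b", "c")`, while `("m", "a", "m")` is left untouched because the pattern must match the sequence from its first element.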

The semantics of CLS is defined as follows. 

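Definition 2.6 (Semantics) Given a finite set of rewrite rules $\mathcal{R}$, the semantics of CLS is the least
relation closed with respect to $\equiv$ and satisfying the following rule:

$$
\frac{P_1 \mapsto P_2 \in \mathcal{R} \qquad \sigma \in \Sigma \qquad P_1\sigma \not\equiv \epsilon \qquad C \in \mathcal{C}}
     {C[P_1\sigma] \to C[P_2\sigma]}
$$

As usual we denote with $\to^*$ the reflexive and transitive closure of $\to$.

Given a set of rewrite rules $\mathcal{R}$, the behaviour of a term $T$ is the tree of terms to which $T$ may reduce.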
Thus, a model in CLS is given by a term describing the initial state of the system and by a set of rewrite 
rules describing all the events that may occur. 

3 Typed Stochastic CLS 

In this section we show how types are used to enhance the expressivity of CLS. In particular, we use 
types to focus on quantitative aspects of CLS, by showing how to model the speeds of the biological 
activities. 

We classify elements in $\mathcal{E}$ with types. Intuitively, given a molecule represented by an element $a$
in $\mathcal{E}$, we associate a type to it which specifies the kind of the molecule. For an element $a$, we
distinguish between occurrences of a single $a$ in parallel with other terms, for which we use a basic type
$t$, and occurrences of $a$ within a sequence, for which we use a sequence type $\overline{t}$. So, types specify
the kind of elements and their positioning. In the following, with type, we mean either a basic type or a
sequence type and we use $\mathtt{t}$ to range over both basic and sequence types. The metavariable $\tau$ ranges
over multisets of types. By $\mathtt{t} \in_n \tau$ we denote that type $\mathtt{t}$ occurs $n$ times in $\tau$,
and $\uplus$ is the union on multisets.

Let $\Gamma$ be a type assignment such that for $a \in \mathcal{E}$, if $\Gamma(a) = t$, then $t$ and
$\overline{t}$ are the types for $a$. The type of a term (or sequence) is the multiset of types (or sequence
types) of its outermost component. This is formalised in the following definition of type of a term and stype
of a sequence.

Definition 3.1 (Mappings type and stype) The mappings type and stype are defined by induction on
terms and sequences as follows:

• - $type((S)^L \rfloor T) = stype(S)$

  - $type(T_1 \mid T_2) = type(T_1) \uplus type(T_2)$

  - $type(S_1 \cdot S_2) = stype(S_1 \cdot S_2)$

  - $type(a) = \{\Gamma(a)\}$

• - $stype(S_1 \cdot S_2) = stype(S_1) \uplus stype(S_2)$

  - $stype(a) = \{\overline{\Gamma(a)}\}$

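For example if $\Gamma(a) = t_a$, $\Gamma(b) = t_b$ and $\Gamma(c) = t_c$ we have
$type(a \mid a \mid c) = \{t_a, t_a, t_c\}$,
$type(b \cdot c \cdot c) = \{\overline{t_b}, \overline{t_c}, \overline{t_c}\}$,
$type(a \mid a \mid c \mid (b \cdot c \cdot c)^L \rfloor a) = \{t_a, t_a, t_c, \overline{t_b}, \overline{t_c}, \overline{t_c}\}$
and $type((b \cdot c \cdot c)^L \rfloor (a \mid a \mid a \mid c)) = \{\overline{t_b}, \overline{t_c}, \overline{t_c}\}$.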
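
The two mappings can be sketched with Python multisets. The encoding is an assumption of this sketch: a term is a list of parallel components, each either `("seq", symbols)` or `("loop", membrane, content)`, and the sequence type of `t` is written as the string `"~t"`. The names `GAMMA`, `stype` and `type_of` are hypothetical.

```python
from collections import Counter

# Type assignment of the example: Gamma(a) = t_a, Gamma(b) = t_b, Gamma(c) = t_c.
GAMMA = {"a": "t_a", "b": "t_b", "c": "t_c"}


def stype(seq):
    """Multiset of sequence types of the elements of a sequence."""
    return Counter("~" + GAMMA[s] for s in seq)


def type_of(term):
    """Multiset of types of the outermost components of a term."""
    result = Counter()
    for comp in term:
        if comp[0] == "loop":
            result += stype(comp[1])               # the content does not contribute
        elif len(comp[1]) == 1:
            result += Counter([GAMMA[comp[1][0]]])  # single element: basic type
        else:
            result += stype(comp[1])               # proper sequence: sequence types
    return result


A, C = ("seq", ("a",)), ("seq", ("c",))
inner_loop = ("loop", ("b", "c", "c"), [A])
```

With these definitions `type_of([A, A, C, inner_loop])` contains `t_a` twice and `t_c` once as basic types, plus the sequence types of `b`, `c`, `c`, and it ignores the `a` inside the loop.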

Term transitions are labelled with a rate $r$, a real number, $T \xrightarrow{r} T'$, modelling the speed of the
transition. The number $r$ depends on the types and multiplicity of the elements interacting.

To compute the rate of transitions we associate to each rule $P \mapsto P'$ the information which is relevant
to the application of the rule. This is expressed by giving:

• for each variable $\chi$ in the pattern $P$, the types of the elements that influence the speed of the
application of the rule,

• a weighting function that combines the multiplicity of types on single variables, producing the
final rate.

We provide this information as follows. Given a pattern $P$, let $V(P) = (\chi_1, \ldots, \chi_m)$ be the list of
(sequence, term, and element) variables of $P$ in left-to-right order of occurrence.

• To each $\chi_i$ we associate a list $\Pi_i = (\mathtt{t}^{(i)}_1, \ldots, \mathtt{t}^{(i)}_{p_i})$ of types,

• Moreover, let $\phi : \mathbb{N}^q \to \mathbb{R}$ be a function from a list of $q = \sum_{1 \le i \le m} p_i$
integers to a real.

The rewrite rules of our Typed Stochastic CLS (TSCLS for short) are of the shape

$$
P \xrightarrow[\phi]{\Pi} P'
$$

where $\Pi = (\Pi_1, \ldots, \Pi_m)$.

For example, as discussed in the following subsection, the transformation of the element $a$ into the
element $b$ inhibited by the presence of the element $c$ can be described by the rule

$$
a \mid X \xrightarrow[\phi]{(t_a, t_c)} b \mid X \qquad (1)
$$

where $\phi = \lambda n_1 n_2.\ \text{if } n_2 = 0 \text{ then } (n_1 + 1) \times k \text{ else } \frac{(n_1 + 1) \times k}{n_2 \times k'}$,
and $k$, $k'$ are the kinetic constant of the state change of $a$ into $b$
and the deceleration due to the presence of one inhibitor $c$, respectively.
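
A rate function of this kind is an ordinary function of the occurrence counts. The sketch below assumes the reading in which the rate is $(n_1 + 1) \times k$ when no inhibitor is present and is divided by $n_2 \times k'$ otherwise; the name `phi_inhibited` and the numeric constants in the usage note are hypothetical.

```python
def phi_inhibited(k, k_prime):
    """Build the rate function for a state change a -> b inhibited by c.

    Assumed reading: (n_a + 1) * k with no inhibitor, and that value divided by
    n_c * k_prime when n_c > 0 inhibitors occur in the same compartment.
    """
    def phi(n_a, n_c):
        base = (n_a + 1) * k        # n_a counts the other occurrences of a
        if n_c == 0:
            return base
        return base / (n_c * k_prime)
    return phi


phi = phi_inhibited(k=2.0, k_prime=4.0)
```

With these constants, `phi(1, 0)` is `4.0` and `phi(1, 1)` is `1.0`: a single inhibitor slows the reaction down by the factor $k'$.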

We consider local interactions, that is interactions between elements in the same compartment. When 
applying a rule, to take into account a whole compartment, we redefine the notion of context by enforcing 
the property that the hole of a context embraces a whole compartment as follows: 
Definition 3.2 (Stochastic Contexts) Stochastic Contexts $C$ are defined as:

$$
C ::= \square \;\big|\; T \mid (S)^L \rfloor C
$$

where $T \in \mathcal{T}$ and $S \in \mathcal{S}$. We denote with $\mathcal{C}_s$ the infinite set of stochastic contexts.
We can now define the typed semantics. 

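Definition 3.3 (Typed Stochastic Semantics) Given a finite set of rewrite rules $\mathcal{R}$, the semantics of
TSCLS is the least relation closed with respect to $\equiv$ and satisfying the following rule:

$$
\frac{\begin{array}{c}
P_1 \xrightarrow[\phi]{\Pi} P_2 \in \mathcal{R} \qquad \sigma \in \Sigma \qquad P_1\sigma \not\equiv \epsilon \qquad C \in \mathcal{C}_s \\
type(\sigma(\chi_i)) = \tau_i \qquad \mathtt{t}^{(i)}_j \in_{n^{(i)}_j} \tau_i \qquad (1 \le j \le p_i) \quad (1 \le i \le m) \\
r = \phi(n^{(1)}_1, \ldots, n^{(1)}_{p_1}, \ldots, n^{(m)}_1, \ldots, n^{(m)}_{p_m})
\end{array}}
{C[P_1\sigma] \xrightarrow{r} C[P_2\sigma]}
$$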

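For example, applying rule (1) with the empty context to the term $a \mid a \mid c$ we have:

$$
a \mid a \mid c \xrightarrow{\frac{2 \times k}{1 \times k'}} a \mid b \mid c \xrightarrow{\frac{1 \times k}{1 \times k'}} b \mid b \mid c
$$

and to the term $a \mid a \mid c \mid (b \cdot c \cdot c)^L \rfloor a$ we have:

$$
a \mid a \mid c \mid (b \cdot c \cdot c)^L \rfloor a
\xrightarrow{\frac{2 \times k}{1 \times k'}} a \mid b \mid c \mid (b \cdot c \cdot c)^L \rfloor a
\xrightarrow{\frac{1 \times k}{1 \times k'}} b \mid b \mid c \mid (b \cdot c \cdot c)^L \rfloor a
$$

Applying (1) with the context $\epsilon \mid (b \cdot c \cdot c)^L \rfloor \square$ to the term
$(b \cdot c \cdot c)^L \rfloor (a \mid a \mid a \mid c)$ we get:

$$
(b \cdot c \cdot c)^L \rfloor (a \mid a \mid a \mid c)
\xrightarrow{\frac{3 \times k}{1 \times k'}} (b \cdot c \cdot c)^L \rfloor (a \mid a \mid b \mid c)
\xrightarrow{\frac{2 \times k}{1 \times k'}} (b \cdot c \cdot c)^L \rfloor (a \mid b \mid b \mid c)
\xrightarrow{\frac{1 \times k}{1 \times k'}} (b \cdot c \cdot c)^L \rfloor (b \mid b \mid b \mid c)
$$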

Note that we cannot use Definition 2.5 for contexts, since we would not count correctly the numbers of
elements which influence the speed of transformations. For example, again rule (1) applied to the term
$a \mid a \mid c$ with the context $\square \mid a \mid c$ would produce the wrong transition:

$$
a \mid a \mid c \xrightarrow{1 \times k} a \mid b \mid c.
$$
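
The role of whole-compartment contexts can be checked numerically. The following sketch is an illustration under assumptions of our own: a compartment is a multiset of basic elements, the inhibited state change uses the reading "$(n_a + 1) \times k$, divided by $n_c \times k'$ when inhibitors are present", and the names `rule_rate` and `trace` are hypothetical.

```python
from collections import Counter


def rule_rate(compartment, k, k_prime):
    """Rate of a | X -> b | X when X is everything else in the compartment."""
    if compartment["a"] == 0:
        return None                      # the pattern a | X does not match
    n_a = compartment["a"] - 1           # occurrences of a left over in X
    n_c = compartment["c"]               # occurrences of the inhibitor in X
    base = (n_a + 1) * k
    return base if n_c == 0 else base / (n_c * k_prime)


def trace(compartment, k, k_prime):
    """Apply the rule until no a is left, collecting the rate of each step."""
    state, rates = Counter(compartment), []
    rate = rule_rate(state, k, k_prime)
    while rate is not None:
        rates.append(rate)
        state["a"] -= 1
        state["b"] += 1
        rate = rule_rate(state, k, k_prime)
    return rates, state


rates, final = trace(Counter(a=2, c=1), k=1.0, k_prime=2.0)
# A hole that sees only a single a, hiding the other a and the inhibitor c:
partial_view_rate = rule_rate(Counter(a=1), k=1.0, k_prime=2.0)
```

With $k = 1$ and $k' = 2$ the two steps have rates `1.0` and `0.5`, whereas the partial view yields `1.0` for the first step, that is $1 \times k$, which ignores both the second $a$ and the inhibitor.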

Given the Continuous Time Markov Chain (CTMC) obtained from the transition system resulting 
from our typed stochastic semantics, we can follow a standard simulation procedure. Roughly speaking, 
the algorithm starts from the initial term (representing a state of the CTMC) and performs a sequence 
of steps by moving from state to state. At each step a global clock variable (initially set to zero) is 
incremented by a random quantity which is exponentially distributed with the exit rate of the current 
state as parameter, and the next state is randomly chosen with a probability proportional to the rates of 
the exit transitions. 

The race condition described above implements the fact that, along the lines of Gillespie's algorithm [15],
when different reactions are competing with different rates, the ones which are not chosen should restart
the competition at the following step.
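
A single step of this simulation procedure can be sketched as follows. This is a generic Gillespie-style step written for illustration, not the authors' simulator; the function name `gillespie_step` and the representation of the outgoing transitions as `(rate, next_state)` pairs are assumptions.

```python
import random


def gillespie_step(transitions, clock, rng=random):
    """Perform one simulation step from the current state.

    transitions: list of (rate, next_state) pairs leaving the current state.
    Returns (new_clock, next_state), or None when no transition is enabled.
    """
    exit_rate = sum(rate for rate, _ in transitions)
    if exit_rate <= 0:
        return None
    # The time increment is exponentially distributed with the exit rate as parameter.
    clock += rng.expovariate(exit_rate)
    # The next state is chosen with probability proportional to its rate.
    pick = rng.uniform(0, exit_rate)
    acc = 0.0
    for rate, nxt in transitions:
        acc += rate
        if pick <= acc:
            return clock, nxt
    return clock, transitions[-1][1]     # guard against rounding at the upper end
```

Since a fresh set of outgoing transitions is computed at every step, the reactions that were not chosen restart the competition from the new state.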
3.1 Modelling Guidelines

In the remainder of this section we put the TSCLS calculus to work in order to model biomolecular
events of interest.

• The application rate in the case of the change of state of an elementary object is proportional to
the number of objects which are present. For this reason if $t_a$ is the type of the object $a$ and $k$ is
the kinetic constant of the state change of $a$ into $b$ we can describe this chemical reaction by the
following rewrite rule:

$$
a \mid X \xrightarrow[\phi]{(t_a)} b \mid X
$$

where $\phi = \lambda n.(n + 1) \times k$. Using this rule we get for example:

$$
(m)^L \rfloor (a \mid a \mid a) \xrightarrow{3 \times k} (m)^L \rfloor (b \mid a \mid a)
$$
$$
(m)^L \rfloor (a \mid a \cdot a) \xrightarrow{1 \times k} (m)^L \rfloor (b \mid a \cdot a)
$$

where $m$ is any membrane.

• In the process of complexation, two elementary objects in the same compartment are combined to
produce a new object. The application rate is then proportional to the product of the numbers of
occurrences of the two objects. Assuming that $t_a$ and $t_b$ are the types of $a$ and $b$ we get:

$$
a \mid b \mid X \xrightarrow[\phi]{(t_a, t_b)} c \mid X
$$

where $\phi = \lambda n_1 n_2.(n_1 + 1) \times (n_2 + 1) \times k$ and $k$ is the kinetic constant of the
modelled chemical reaction.

Using the same conventions a similar and simpler rule describes decomplexation:

$$
c \mid X \xrightarrow[\phi]{(t_c)} a \mid b \mid X
$$

where $\phi = \lambda n.(n + 1) \times k$.

• Another phenomenon which can be easily rendered in our formalism is the osmosis regulating the
quantity of water inside and outside a cell for a dilute solution of non-dissociating substances. In
fact in this case according to [26] the total flow is $L_p \frac{S}{V} \Delta\psi_w$, where $L_p$ is the
hydraulic conductivity constant, which depends on the semi-permeability properties of the membrane, $S$ is
the surface of the cell, $V$ is the volume of the cell, $\Delta\psi_w = \psi_{w(ext)} - \psi_{w(int)}$ is the
difference between the water potentials outside and inside the cell. The water potential for
non-dissociating substances is the sum of the solute potential $\psi_s = -RTc_s$ (where $R$ is the gas
constant, $T$ is the absolute temperature and $c_s$ is the solute concentration) and the pressure potential
$\psi_p$ (which depends on the elastic properties of the membrane and on the cell wall). We can therefore
consider the rate of flow of water proportional (via a constant $k$) to
$\frac{S}{V}(c_{s(ext)} - c_{s(int)})$, where the sign of this real gives
the direction of the flow. The membrane crossing of the element $a$ according to the concentration
of the elements $b$ inside and outside the cell is given by the pairs of rules:

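$$
(\tilde{x})^L \rfloor (a \mid X) \mid Y \xrightarrow[\phi]{\Pi} (\tilde{x})^L \rfloor X \mid a \mid Y
$$

$$
(\tilde{x})^L \rfloor X \mid a \mid Y \xrightarrow[\phi']{\Pi} (\tilde{x})^L \rfloor (X \mid a) \mid Y
$$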
where ^-^iW 4 .|x( ^ ^ )xk

elements a and respectively. 

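The positive catalysis of osmosis by the presence of elements $c$ on the membrane is rendered by:

$$
(\tilde{x})^L \rfloor (a \mid X) \mid Y \xrightarrow[\phi]{\Pi} (\tilde{x})^L \rfloor X \mid a \mid Y
$$

$$
(\tilde{x})^L \rfloor X \mid a \mid Y \xrightarrow[\phi']{\Pi} (\tilde{x})^L \rfloor (X \mid a) \mid Y
$$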

= \n x n 2 n^n A n*,.(n x x fc c + 1) x f x ( , , v - , + v ) x 

where x , , . , \ ^+ 1 ^«+"3»'* (n 4 +L)Va+n s v b and fc c is the ac- 

= Anin 2 n 3 n 4 n5.(ni x k c + 1) x A x ( (w4+1) £ +BiVt - (^+1)^^ ) x k 
celeration due to the presence of one element c. 

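Similarly the inhibition of osmosis by the presence of elements $c$ on the membrane is rendered by:

$$
(\tilde{x})^L \rfloor (a \mid X) \mid Y \xrightarrow[\phi]{\Pi} (\tilde{x})^L \rfloor X \mid a \mid Y
$$

$$
(\tilde{x})^L \rfloor X \mid a \mid Y \xrightarrow[\phi']{\Pi} (\tilde{x})^L \rfloor (X \mid a) \mid Y
$$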

where ^ = ^i" 2 " 3 " 4 " 5 " if »i=0 tol else m xk c X V X ^ (n 2 +l^ a+ n iVb ~ («4+l)V fl +« 3 Vi ) X k „ nf1 u :„ 
wueie , , j ^ . n5 „ 3 . aiiu /c c is 

<p - Anin 2 n 3 n4n 5 . i{ „ 1=0then x e i seniXfcc x ? x (, ( , l4+1) y a+n5l / 4 - ( n2 +i)v a +n 3 vJ x fc 
the deceleration due to the presence of one element c. 

If the rule

$$
P_1 \xrightarrow[\phi]{\Pi} P_2
$$

describes an event, in order to express that this event is positively catalysed by an element $c$ we
can modify the rewrite rule as follows.

If $P_1 \equiv P_1' \mid X$, the type list of $X$ is $\Pi_X$ and the weighting function $\phi$ is
$\lambda \overline{n}\, \overline{n_X}.e$, where $\overline{n}$ takes into account the elements occurring in
$P_1'$ and $\overline{n_X}$ takes into account the elements occurring in $X$, we define:

- $\Pi'_X$ as the list whose head is $t_c$ and whose tail is $\Pi_X$,

- $\phi' = \lambda \overline{n}\, n_c\, \overline{n_X}.e \times (n_c \times k + 1)$,

where $k$ is the acceleration due to the presence of one positive catalyser $c$. The new rule is obtained
from the old one by replacing $\Pi_X$ and $\phi$ with $\Pi'_X$ and $\phi'$, respectively.
Otherwise if $P_1 \not\equiv P_1' \mid X$, the new rule is:

$$
P_1 \mid X \xrightarrow[\phi']{\Pi \cdot ((t_c))} P_2 \mid X
$$

where $\cdot$ represents list concatenation and if $\phi = \lambda \overline{n}.e$, then
$\phi' = \lambda \overline{n}\, n_c.e \times (n_c \times k + 1)$.

Similarly we can represent the effect of an inhibitor just replacing the inserted multiplications by
divisions. We can also represent in one rule both positive and negative catalysers. For example to
add the effect of a positive catalyser $c$ and an inhibitor $d$ to the rule
$P_1 \xrightarrow[\phi]{\Pi} P_2$, if $P_1 \equiv P_1' \mid X$ and $\Pi_X$, $\phi$
are as above we define:



M. Dezani-Ciancaglini, P. Giannini and A. Troina 






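- $\Pi'_X = (t_c, t_d) \cdot \Pi_X$,

- $\phi' = \lambda \overline{n}\, n_c\, n_d\, \overline{n_X}.e \times \text{if } n_d = 0 \text{ then } (n_c \times k + 1) \text{ else } \frac{n_c \times k + 1}{n_d \times k'}$,

where $k$ is the acceleration due to the presence of one positive catalyser $c$ and $k'$ is the deceleration
due to the presence of one inhibitor $d$.

Otherwise if $P_1 \not\equiv P_1' \mid X$, the new rule is:

$$
P_1 \mid X \xrightarrow[\phi']{\Pi \cdot ((t_c, t_d))} P_2 \mid X
$$

where if $\phi = \lambda \overline{n}.e$, then
$\phi' = \lambda \overline{n}\, n_c\, n_d.e \times \text{if } n_d = 0 \text{ then } (n_c \times k + 1) \text{ else } \frac{n_c \times k + 1}{n_d \times k'}$.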

Looking at the previous examples, we claim that our formalism highlights the duality between the roles of
positive and negative catalysers more clearly than other formalisms.
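
The guidelines above can be summarised as a small library of rate-function builders. The sketch below is illustrative only: all function names are hypothetical, the catalyser or inhibitor count is assumed to be passed as the last argument (the variant in which a new type list is appended), and the inhibitor is assumed to divide the rate by $n_d \times k'$ whenever $n_d > 0$.

```python
def state_change(k):
    """phi = \\n.(n + 1) * k, where n counts the other occurrences of the object."""
    return lambda n: (n + 1) * k


def complexation(k):
    """phi = \\n1 n2.(n1 + 1) * (n2 + 1) * k for two objects forming a complex."""
    return lambda n1, n2: (n1 + 1) * (n2 + 1) * k


def with_catalyser(phi, k_acc):
    """Multiply the rate by (n_c * k_acc + 1); n_c is the last argument."""
    return lambda *ns: phi(*ns[:-1]) * (ns[-1] * k_acc + 1)


def with_inhibitor(phi, k_dec):
    """Divide the rate by n_d * k_dec when n_d > 0; n_d is the last argument."""
    def phi_prime(*ns):
        n_d = ns[-1]
        e = phi(*ns[:-1])
        return e if n_d == 0 else e / (n_d * k_dec)
    return phi_prime


# State change of a, accelerated by c and slowed down by d: arguments (n_a, n_c, n_d).
both = with_inhibitor(with_catalyser(state_change(1.0), 0.5), 2.0)
```

Composing the two wrappers gives one rate function for a rule with both a positive catalyser and an inhibitor, which makes the duality between multiplication and division explicit: `both(2, 4, 0)` is `9.0` and `both(2, 4, 3)` is `1.5`.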

4 An Application: The Lactose Operon 

To show that our framework can be easily used to model and simulate cellular pathways, we give a model 
of the well-known regulation process of the lactose operon in Escherichia coli. 

E. coli is a bacterium often present in the intestine of many animals. It is one of the most thoroughly
studied of all living things and it is a favourite organism for genetic engineering. Cultures of E. coli can
be made to produce unlimited quantities of the product of an introduced gene. Like most bacteria, E. coli is
often exposed to a constantly changing physical and chemical environment, and reacts to changes in its
environment through changes in the kinds of enzymes it produces. In order to save energy, bacteria do not
synthesize degradative enzymes unless the substrates for these enzymes are present in the environment.
For example, E. coli does not synthesize the enzymes that degrade lactose unless lactose is in the
environment. This result is obtained by controlling the transcription of some genes into the corresponding
enzymes.

Two enzymes are involved in lactose degradation: the lactose permease, which is incorporated in the
membrane of the bacterium and actively transports the sugar into the cell, and the beta galactosidase,
which splits lactose into glucose and galactose. The bacterium also produces the transacetylase enzyme,
whose role in the lactose degradation is marginal.

The sequence of genes in the DNA of E. coli which produces the described enzymes is known as the
lactose operon.

The first three genes of the operon (i, p and o) regulate the production of the enzymes, and the 
last three (z, y and a), called structural genes, are transcribed (when allowed) into the mRNA for beta 
galactosidase, lactose permease and transacetylase, respectively. 

The regulation process is as follows (see Figure 2): gene i encodes the lac Repressor, which, in
the absence of lactose, binds to gene o (the operator). Transcription of structural genes into mRNA is
performed by the RNA polymerase enzyme, which usually binds to gene p (the promoter) and scans
the operon from left to right by transcribing the three structural genes z, y and a into a single mRNA
fragment. When the lac Repressor is bound to gene o, it becomes an obstacle for the RNA polymerase,
and the transcription of the structural genes is not performed. On the other hand, when lactose is present
inside the bacterium, it binds to the Repressor, which can then no longer stop the activity of the RNA
polymerase. In this case the transcription is performed and the three enzymes for lactose degradation are
synthesized.
Figure 2: The regulation process in the Lac Operon.



4.1 Typed Stochastic CLS Model 

A detailed mathematical model of the regulation process can be found in [28]. It includes information
on the influence of lactose degradation on the growth of the bacterium.

We give a TSCLS model of the gene regulation process, with stochastic rates taken from [27]. We
model the membrane of the bacterium as the looping sequence $(m)^L$, where the alphabet symbol $m$
generically denotes the whole membrane surface in normal conditions. Moreover, we model the lactose
operon as the sequence
$\mathit{lacI} \cdot \mathit{lacP} \cdot \mathit{lacO} \cdot \mathit{lacZ} \cdot \mathit{lacY} \cdot \mathit{lacA}$
($\mathit{lacI{-}A}$ for short), in which each symbol
corresponds to a gene. We replace $\mathit{lacO}$ with $\mathit{RO}$ in the sequence when the lac Repressor is
bound to gene o, and $\mathit{lacP}$ with $\mathit{PP}$ when the RNA polymerase is bound to gene p. When the lac
Repressor and the RNA polymerase are unbound, they are modelled by the symbols $\mathit{repr}$ and
$\mathit{polym}$, respectively. We model the mRNA of the lac Repressor as the symbol $\mathit{Irna}$, a molecule
of lactose as the symbol $\mathit{LACT}$, and beta galactosidase, lactose permease and transacetylase enzymes as
symbols $\mathit{betagal}$, $\mathit{perm}$ and $\mathit{transac}$, respectively. Finally, since the three
structural genes are transcribed into a single mRNA fragment, we
model such mRNA as a single symbol $\mathit{Rna}$.

The transcription of the DNA, the binding of the lac Repressor to gene o, and the interaction between 
lactose and the lac Repressor are modelled by the following set of stochastic typed rewrite rules: 

lacI-A | X lacI-A \ Irna \X (Rl ) 



where = 0.02. 

Irna \ X - — > Irna \repr\X (R2) 



where t is the type of Irna and <p = Xn.(n + 1) x 0.1. 

lacI—A | polym \ X - — > lad ■ PP ■ lacO ■ lacZ ■ lacY ■ lacA \ X (R3) 


where t is the type of polym and = Xn.(n + 1) x 0.1. 

lad ■ PP ■ lacO ■ lacZ ■ lacY ■ lacA | X — lacI—A | polym \ X (R4) 
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where = 0.01. 

lacl ■ PP ■ lacO ■ lacZ ■ lacY ■ lacA \ X lacI—A \ polym \ Rna \ X (R5) 



where = 20. 

Rna | X -^-^ 7?«a | betagal \ perm \ transac \ X (R6) 



where t is the type of Rna and = + 1) x 0.1. 

lacI—A \repr\X lacl ■ lacP ■ RO ■ lacZ ■ lacY ■ lac A \ X (R7) 



where t is the type of repr and = Xn.(n + 1) x 1. 

lacl ■ PP ■ lacO ■ lacZ ■ lacY ■ lacA \ repr \ X lacl PP RO- lacZ ■ lacY ■ lacA \ X (R8) 



where t is the type of repr and = Xn.{n + 1) x 1. 

lacl ■ lacP ■ RO ■ lacZ ■ lacY ■ lacA \ X lacI-A \ repr \ X (R9) 



where = 0.01. 

lacl PPRO- lacZ ■ lacY ■ lacA \ X lacl ■ PP ■ lacO ■ lacZ ■ lacY ■ lacA \ repr \ X (RIO) 



where = 0.01. 

repr \ LACT \X {{ ' r — > RLACT \ X (Rll) 



where t r and t\ are the types of repr and LACT and = Ani«2-(«i + 1) x ("2 + 1) x 0.005. 

RLACT \X ^ repr \ LACT \X (R12) 



where t is the type of RLACT and = Xn.(n + 1) x 0.1. 

Rules (R1) and (R2) describe the transcription and translation of gene i into the lac Repressor (assumed
for simplicity to be performed without the intervention of the RNA polymerase). Rules (R3) and
(R4) describe binding and unbinding of the RNA polymerase to gene p. Rules (R5) and (R6) describe the
transcription and translation of the three structural genes. Transcription of such genes can be performed
only when the sequence contains $\mathit{lacO}$ instead of $\mathit{RO}$, that is when the lac Repressor is not
bound to gene o. Rules (R7)-(R10) describe binding and unbinding of the lac Repressor to gene o. Finally,
rules (R11) and (R12) describe the binding and unbinding, respectively, of the lactose to the lac Repressor.

The following rules describe the behaviour of the three enzymes for lactose degradation: 

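$$
(\tilde{x})^L \rfloor (\mathit{perm} \mid X) \mid Y \xrightarrow[\phi]{((), (t), ())} (\mathit{perm} \cdot \tilde{x})^L \rfloor X \mid Y \qquad (R13)
$$

where $t$ is the type of $\mathit{perm}$ and $\phi = \lambda n.(n + 1) \times 0.1$.

$$
(\tilde{x})^L \rfloor X \mid \mathit{LACT} \mid Y \xrightarrow[\phi]{((\overline{t_p}), (), (t_l))} (\tilde{x})^L \rfloor (\mathit{LACT} \mid X) \mid Y \qquad (R14)
$$

where $\overline{t_p}$ and $t_l$ are the types of $\mathit{perm}$ and $\mathit{LACT}$, respectively, and
$\phi = \lambda n_1 n_2.n_1 \times (n_2 + 1) \times 0.001$.

$$
\mathit{LACT} \mid X \xrightarrow[\phi]{(t_l, t_b)} \mathit{GLU} \mid \mathit{GAL} \mid X \qquad (R15)
$$

where $t_l$ and $t_b$ are the types of $\mathit{LACT}$ and $\mathit{betagal}$, and
$\phi = \lambda n_1 n_2.(n_1 + 1) \times n_2 \times 0.001$.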

Rule (R13) describes the incorporation of the lactose permease in the membrane of the bacterium, 
rule (R14) the transportation of lactose from the environment to the interior performed by the lactose 
permease, and rule (R15) the decomposition of the lactose into glucose (denoted GLU) and galactose 
(denoted GAL) performed by the beta galactosidase. 

The initial state of the bacterium when no lactose is present in the environment and when 100
molecules of lactose are present are modelled, respectively, by the following terms (where $n \times T$ stands
for a parallel composition $T \mid \ldots \mid T$ of length $n$):

$$
\mathit{Ecoli} ::= (m)^L \rfloor (\mathit{lacI{-}A} \mid 30 \times \mathit{polym} \mid 100 \times \mathit{repr}) \qquad (2)
$$
$$
\mathit{EcoliLact} ::= \mathit{Ecoli} \mid 100 \times \mathit{LACT} \qquad (3)
$$

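Now, starting from the term $\mathit{EcoliLact}$, a possible stochastic trace generated by our semantics, given
the rules above, is¹:

$$
\begin{array}{l}
\mathit{EcoliLact} \xrightarrow{(R3),\ 30 \times 0.1} 100 \times \mathit{LACT} \mid (m)^L \rfloor (\mathit{lacI} \cdot \mathit{PP} \cdot \mathit{lacO} \cdot \mathit{lacZ} \cdot \mathit{lacY} \cdot \mathit{lacA} \mid 29 \times \mathit{polym} \mid 100 \times \mathit{repr}) \\
\xrightarrow{(R5),\ 20} 100 \times \mathit{LACT} \mid (m)^L \rfloor (\mathit{lacI{-}A} \mid 30 \times \mathit{polym} \mid 100 \times \mathit{repr} \mid \mathit{Rna}) \\
\xrightarrow{(R6),\ 1 \times 0.1} 100 \times \mathit{LACT} \mid (m)^L \rfloor (\mathit{lacI{-}A} \mid 30 \times \mathit{polym} \mid 100 \times \mathit{repr} \mid \mathit{Rna} \mid \mathit{betagal} \mid \mathit{perm} \mid \mathit{transac}) \\
\xrightarrow{(R13),\ 1 \times 0.1} 100 \times \mathit{LACT} \mid (\mathit{perm} \cdot m)^L \rfloor (\mathit{lacI{-}A} \mid 30 \times \mathit{polym} \mid 100 \times \mathit{repr} \mid \mathit{Rna} \mid \mathit{betagal} \mid \mathit{transac}) \\
\xrightarrow{(R14),\ 1 \times 100 \times 0.001} 99 \times \mathit{LACT} \mid (\mathit{perm} \cdot m)^L \rfloor (\mathit{lacI{-}A} \mid 30 \times \mathit{polym} \mid 100 \times \mathit{repr} \mid \mathit{Rna} \mid \mathit{betagal} \mid \mathit{transac} \mid \mathit{LACT}) \\
\xrightarrow{(R15),\ 1 \times 1 \times 0.001} 99 \times \mathit{LACT} \mid (\mathit{perm} \cdot m)^L \rfloor (\mathit{lacI{-}A} \mid 30 \times \mathit{polym} \mid 100 \times \mathit{repr} \mid \mathit{Rna} \mid \mathit{betagal} \mid \mathit{transac} \mid \mathit{GLU} \mid \mathit{GAL})
\end{array}
$$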
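
The rates along such a trace can be recomputed mechanically from the rate functions of the rules involved. The sketch below is a count-based illustration and not a CLS interpreter: the dictionary layout of the state, the names `initial_state`, `RATES`, `fire` and `trace_rates`, and the reading of each rate function in terms of current counts are assumptions of this sketch.

```python
from collections import Counter


def initial_state():
    """EcoliLact stored as element counts per compartment."""
    return {
        "operon": "free",              # "free" = lacI-A, "PP" = polymerase bound to gene p
        "out": Counter(LACT=100),
        "membrane": Counter(m=1),
        "in": Counter(polym=30, repr=100),
    }


# For phi = \n.(n + 1) * k the pattern itself consumes one occurrence, so (n + 1)
# equals the current count of that element in the compartment.
RATES = {
    "R3": lambda s: s["in"]["polym"] * 0.1 if s["operon"] == "free" else 0.0,
    "R5": lambda s: 20.0 if s["operon"] == "PP" else 0.0,
    "R6": lambda s: s["in"]["Rna"] * 0.1,
    "R13": lambda s: s["in"]["perm"] * 0.1,
    "R14": lambda s: s["membrane"]["perm"] * s["out"]["LACT"] * 0.001,
    "R15": lambda s: s["in"]["LACT"] * s["in"]["betagal"] * 0.001,
}


def fire(s, rule):
    """Update the counts according to the right hand side of the rule."""
    inside, outside, membrane = s["in"], s["out"], s["membrane"]
    if rule == "R3":
        inside["polym"] -= 1
        s["operon"] = "PP"
    elif rule == "R5":
        inside["polym"] += 1
        inside["Rna"] += 1
        s["operon"] = "free"
    elif rule == "R6":
        inside.update(betagal=1, perm=1, transac=1)
    elif rule == "R13":
        inside["perm"] -= 1
        membrane["perm"] += 1
    elif rule == "R14":
        outside["LACT"] -= 1
        inside["LACT"] += 1
    elif rule == "R15":
        inside["LACT"] -= 1
        inside["GLU"] += 1
        inside["GAL"] += 1


def trace_rates(rules):
    """Return the rate of each step of the given rule sequence and the final state."""
    s, rates = initial_state(), []
    for rule in rules:
        rates.append(RATES[rule](s))
        fire(s, rule)
    return rates, s


rates, final = trace_rates(["R3", "R5", "R6", "R13", "R14", "R15"])
```

Under these assumptions the first step has rate $30 \times 0.1$, in agreement with the label of the first transition of the trace, and the final state contains 99 molecules of lactose outside the membrane and one molecule each of glucose and galactose inside.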



5 Conclusions and Related Work 

This paper is a first proposal for using types in describing quantitative aspects of biological systems.
Types for qualitative properties of the CLS calculus have been studied in [2] and [14]. We plan to
develop a prototype simulator for our calculus TSCLS in order to experimentally test the expressiveness
of our formalism. This would make it possible to compare quantitatively the approach presented in this
paper, with the one of O. 

In the remainder of this section we place our paper in the framework of qualitative and quantitative
models of biological systems.

¹For simplicity we just show the rate of the transition reaching the target state considered in the trace. We do
not report explicitly the whole exit rate from a given term, which should be computed, following the standard
simulation algorithm, by summing up the rates for all the possible target states. For the sake of readability,
we also show, on the transitions, the labels of the rules leading to the state change.
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Qualitative Models. In the last few years many formalisms originally developed by computer scientists 
to model systems of interacting components have been applied to Biology. Among these, there are Petri 
Nets [18 ], Hybrid Systems JT), and the 7T-calculus lfTTll25l . Moreover, new formalisms have been defined 
for describing biomolecular and membrane interactions lH[9l[T0l[T3]|22l[24]]. Others, such as P-Sy stems 
|2D . have been proposed as biologically inspired computational models and have been later applied to 
the description of biological systems. 

The π-calculus and new calculi based on it [22, 24] have been particularly successful in the description 
of biological systems, as they allow describing systems in a compositional manner. Interactions 
of biological components are modelled as communications on channels whose names can be passed; 
sharing names of private channels allows describing biological compartments. 

These calculi offer very low-level interaction primitives, which may cause the resulting models to 
become very large and difficult to read. Calculi such as those proposed in [9, 10, 13] give a more 
abstract description of systems and offer special biologically motivated operators. However, they are 
often specialized to the description of some particular kinds of phenomena such as membrane interactions 
or protein interactions. 

P-Systems [21] have a simple notation and are not specialized to the description of a particular class 
of systems, but they are still not completely general. For instance, it is possible to describe biological 
membranes and the movement of molecules across membranes, and there are some variants able to 
describe also more complex membrane activities. However, the formalism is not flexible enough to 
easily describe new activities observed on membranes without being extended to model such 
activities. 

Danos and Laneve [13] proposed the κ-calculus. This formalism is based on graph rewriting, where 
the behaviour of processes (compounds) and of sets of processes (solutions) is given by a set of rewrite 
rules which account for, e.g., activation, synthesis and complexation by explicitly modelling the binding 
sites of a protein. 

The Calculus of Looping Sequences [4] has no explicit way to model protein domains (however 
they can be encoded, and a variant with explicit binding has been defined in [3]), but accounts for an 
explicit mechanism (the looping sequences) to deal with compartments and membranes. Thus, while the 
κ-calculus seems more suitable to model protein interactions, CLS allows for a more natural description 
of membrane interactions. Another feature lacking in other formalisms is the capacity to express ordered 
sequences of elements. To the best of our knowledge, CLS is the first formalism offering such a feature 
in an explicit way, thus allowing one to operate naturally over proteins or DNA fragments, which should 
frequently be defined as ordered sequences of elements. 

Stochastic Models. Among stochastic process algebras we would like to mention the stochastic 
extension of the π-calculus, given by Priami et al. in [23], and the PEPA framework proposed by Hillston 
in [16]. We would also like to compare our work with two closer ones, namely [6] and [8]. 

The stochastic engine behind PEPA and the Stochastic π-calculus is constructed on the intuition of 
cooperating agents under different bandwidth limits. If two agents are interacting, the time spent for a 
communication is given by the slowest of the agents involved. In contrast, our stochastic semantics is 
defined in terms of the collision-based paradigm introduced by Gillespie. A similar approach is taken 
in the quantitative variant of the κ-calculus ([12]) and in BioSPi ([23]). Motivated by the law of mass 
action, here we need to count the number of the reactants present in a system in order to compute the 
exact rate of a reaction. In [17], a stochastic semantics for bigraphs has been developed. An application 
in the field of systems biology has been provided by modelling a process of membrane budding. 

A stochastic semantics for CLS (SCLS) has been defined in [6]. Such a semantics computes the 
transition rates by resorting to a complete counting mechanism to detect all the possible occurrences of 
patterns within a term. In our framework, the set of rule schemata that can be defined is limited with 
respect to SCLS; however, our counting mechanism, based on types, is considerably simpler in practice. 
This would simplify, for example, the development of automatic simulators. As another advantage, 
our rules, similarly to what happens in [8] for a variant of the ambient calculus, are equipped with rate 
functions, rather than with rate constants. Such functions may allow the definition of kinetics that are 
more complex than the standard mass-action ones. 
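
The difference between rate constants and rate functions can be sketched in Python as follows. The names `mass_action` and `inhibited`, the representation of a context as a map from element names to counts, and all numeric values are hypothetical; the functional forms are illustrative assumptions and are not the rate functions defined in this paper. A mass-action rate multiplies a constant by the number of reactants counted in the context, whereas a rate function may use the context more freely, for instance to slow a rule down when an inhibitor is present:

```python
from typing import Callable, Mapping

# Assumption: a context is modelled simply as a count of elements by name.
Context = Mapping[str, int]
RateFunction = Callable[[Context], float]


def mass_action(k: float, reactant: str) -> RateFunction:
    """Standard mass-action kinetics: constant k times the number of reactants."""
    return lambda ctx: k * ctx.get(reactant, 0)


def inhibited(base: RateFunction, inhibitor: str, strength: float) -> RateFunction:
    """Hypothetical inhibition: the more inhibitor elements, the lower the rate."""
    return lambda ctx: base(ctx) / (1.0 + strength * ctx.get(inhibitor, 0))


# Illustrative context with 30 polym and 100 repr elements.
context = {"polym": 30, "repr": 100}
plain = mass_action(0.5, "polym")
slowed = inhibited(plain, "repr", 0.01)
plain_rate = plain(context)    # 0.5 * 30
slowed_rate = slowed(context)  # plain_rate / (1 + 0.01 * 100)
```

With a rate constant only the first form is expressible; the second shows how a function of the context can encode a kinetics that is not mass-action.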

Bioambients [24] is a calculus in which biological systems are modelled using a variant of the 
ambient calculus. In Bioambients both membranes and elements are modelled by ambients, and activities 
by capabilities (enter, exit, expel, etc.). In [8], Bioambients are extended by allowing the rates associated 
with rules to be context dependent. Dependency is realized by associating with a rule a function which is 
evaluated when applying the rule, and depends on the context of the application. The context contains 
(as for our stochastic contexts) the state of the sibling ambients, that is, the ambients in parallel in the 
innermost enclosing ambient (membrane). The property of the context used to determine the value of 
the function is its volume, which summarizes (with a real number) the elements present in the context. In 
Section 3 we sketched the representation of osmosis in our framework: the same example is presented 
with all details in [8]. However, our modelling is more general, allowing one to focus more selectively on 
the context and to specify functions that may also cause inhibition. 

Finally, MGS (http://mgs.spatial-computing.org/) is a domain-specific language for the simulation 
of biological processes. The state of a dynamical system is represented by a collection. The elements 
in the collection represent either entities (a subsystem or an atomic part of the dynamical system) or 
messages (signal, command, information, action, etc.) addressed to an entity. The dynamics is defined by 
rewrite rules specifying the collection to be substituted through a pattern language based on the 
neighborhood relationship induced by the topology of the collection. It is possible to specify stochastic rewrite 
strategies. In [20], this feature is used to provide the description of various models of the genetic switch 
of the λ phage, from a very simple biochemical description of the process to an individual-based model 
on a Delaunay graph topology. Note that, in MGS, the topological changes are programmed in some 
external language, whereas in CLS they are specified directly by the rewrite rules. 